AITopics | British Virgin Islands

AI hallucinations found in high-profile Wall Street law firm filing

The Guardian | Apr-22-2026, 08:09:34 GMT

The elite Wall Street law firm Sullivan & Cromwell has told a court that a major filing it made in a high-profile case contained errors resulting from hallucinations generated by artificial intelligence. Andrew Dietderich, the co-head of the firm's global restructuring group, apologised in a letter to the New York federal judge Martin Glenn on Saturday for the string of mistakes, which included inaccurate citations. The errors, uncovered by the law firm Boies Schiller Flexner (BSF), which was also working on the case, included misquoting the US bankruptcy code and citing cases incorrectly in a filing made on 9 April. In multiple instances, S&C, which employs more than 900 lawyers and has one of the top reputations for corporate work in the US, filed inaccurately summarised conclusions made in other cases using AI. "We deeply regret that this has occurred," said Dietderich in the letter.

artificial intelligence, social media, (10 more...)

The Guardian

Country:

North America > United States > New York > New York County > New York City (0.62)
Asia > China (0.16)
Europe > Ukraine (0.08)
(3 more...)

Industry:

Law (1.00)
Leisure & Entertainment > Sports (0.74)
Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.77)

WikiVideo: Article Generation from Multiple Videos

Martin, Alexander, Kriz, Reno, Walden, William Gantt, Sanders, Kate, Recknor, Hannah, Yang, Eugene, Ferraro, Francis, Van Durme, Benjamin

arXiv.org Artificial Intelligence | Apr-1-2025

We present the challenging task of automatically creating a high-level Wikipedia-style article that aggregates information from multiple diverse videos about real-world events, such as natural disasters or political elections. Videos are intuitive sources for retrieval-augmented generation (RAG), but most contemporary RAG workflows focus heavily on text and existing methods for video-based summarization focus on low-level scene understanding rather than high-level event semantics. To close this gap, we introduce WikiVideo, a benchmark consisting of expert-written articles and densely annotated videos that provide evidence for articles' claims, facilitating the integration of video into RAG pipelines and enabling the creation of in-depth content that is grounded in multimodal sources. We further propose Collaborative Article Generation (CAG), a novel interactive method for article creation from multiple videos. CAG leverages an iterative interaction between an r1-style reasoning model and a VideoLLM to draw higher level inferences about the target event than is possible with VideoLLMs alone, which fixate on low-level visual features. We benchmark state-of-the-art VideoLLMs and CAG in both oracle retrieval and RAG settings and find that CAG consistently outperforms alternative methods, while suggesting intriguing avenues for future work.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.00939

Country:

Europe > France > Île-de-France > Paris > Paris (0.29)
North America > The Bahamas (0.14)
North America > United States > Georgia (0.14)
(43 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Fire & Emergency Services (1.00)
Government > Voting & Elections (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

Cho, Sukmin, Choi, Sangjin, Hwang, Taeho, Seo, Jeongyeon, Jeong, Soyeong, Lee, Huije, Song, Hoyun, Park, Jong C., Kwon, Youngjin

arXiv.org Artificial Intelligence | Feb-8-2025

Accelerating inference in Large Language Models (LLMs) is critical for real-time interactions, as they have been widely incorporated into real-world services. Speculative decoding, a fully algorithmic solution, has gained attention for improving inference speed by drafting and verifying tokens, thereby generating multiple tokens in a single forward pass. However, current drafting strategies usually require significant fine-tuning or have inconsistent performance across tasks. To address these challenges, we propose Hierarchy Drafting (HD), a novel lossless drafting approach that organizes various token sources into multiple databases in a hierarchical framework based on temporal locality. In the drafting step, HD sequentially accesses multiple databases to obtain draft tokens from the highest to the lowest locality, ensuring consistent acceleration across diverse tasks and minimizing drafting latency. Our experiments on Spec-Bench using LLMs with 7B and 13B parameters demonstrate that HD outperforms existing database drafting methods, achieving robust inference speedups across model sizes, tasks, and temperatures.
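The drafting step described in the abstract can be illustrated with a minimal sketch. It assumes each database is a plain dictionary from the last `n` context tokens to a list of draft tokens, and that verification keeps the longest draft prefix that matches the target model's own tokens. The names `draft_tokens`, `accept_prefix`, and `DraftDB` are hypothetical and are not taken from the paper's code.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical database type: last-n-token context -> candidate draft tokens.
DraftDB = Dict[Tuple[str, ...], List[str]]


def draft_tokens(context: Sequence[str], databases: List[DraftDB], n: int = 2) -> List[str]:
    """Query databases in order (highest to lowest locality) and return
    the draft from the first one that contains the current context key."""
    key = tuple(context[-n:])
    for db in databases:
        if key in db:
            return db[key]
    return []  # no database had a match, so nothing is drafted


def accept_prefix(draft: Sequence[str], target: Sequence[str]) -> List[str]:
    """Keep draft tokens only up to the first disagreement with the target
    model's tokens, so the accepted output equals what the target would emit."""
    accepted: List[str] = []
    for d, t in zip(draft, target):
        if d != t:
            break
        accepted.append(d)
    return accepted


# Toy databases, ordered from highest to lowest locality (assumed ordering).
context_db: DraftDB = {("the", "cat"): ["sat", "on"]}
model_db: DraftDB = {("the", "cat"): ["ran"], ("a", "dog"): ["barked"]}
stats_db: DraftDB = {("hello", "world"): ["!"]}
hierarchy = [context_db, model_db, stats_db]

draft = draft_tokens(["see", "the", "cat"], hierarchy)
accepted = accept_prefix(draft, ["sat", "in"])
```

In this toy run the highest-locality database answers first, so the lower-locality entry for the same key is never consulted, and only the matching prefix of the draft survives verification.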

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.05609

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
North America > Cuba (0.04)
(19 more...)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Embedding Knowledge Graph in Function Spaces

Teyou, Louis Mozart Kamdem, Demir, Caglar, Ngomo, Axel-Cyrille Ngonga

arXiv.org Machine Learning | Sep-24-2024

We introduce a novel embedding method that departs from standard knowledge graph embedding techniques by operating within function spaces of finite dimension rather than a finite vector space. We begin by computing embeddings with polynomial functions and then progress to more intricate representations using neural networks with varying layer complexity. We argue that employing functions for embedding computation enhances expressiveness and allows for more degrees of freedom, enabling operations such as composition, differentiation, and taking primitives of entity representations. Additionally, we outline the step-by-step construction of our approach and provide code for reproducibility, thereby facilitating further exploration and application in the field.

dataset, fmult, relation, (13 more...)

arXiv.org Machine Learning

2409.14857

Country:

North America > United States > Idaho > Ada County > Boise (0.05)
North America > Belize (0.05)
Europe > Slovakia (0.05)
(20 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.65)

MIRAI: Evaluating LLM Agents for Event Forecasting

Ye, Chenchen, Hu, Ziniu, Deng, Yihe, Huang, Zijie, Ma, Mingyu Derek, Zhu, Yanqiao, Wang, Wei

arXiv.org Artificial Intelligence | Jul-1-2024

Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information and reason over it to solve complex problems. Given this capability, there is increasing interest in employing LLM agents to predict international events, which can influence decision-making and shape policy development on an international scale. Despite this growing interest, there is no rigorous benchmark of LLM agents' forecasting capability and reliability. To address this gap, we introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles. We refine the GDELT event database with careful cleaning and parsing to curate a series of relational prediction tasks with varying forecasting horizons, assessing LLM agents' abilities from short-term to long-term forecasting. We further implement APIs to enable LLM agents to utilize different tools via a code-based interface. In summary, MIRAI comprehensively evaluates the agents' capabilities in three dimensions: 1) autonomously sourcing and integrating critical information from large global databases; 2) writing code using domain-specific APIs and libraries for tool use; and 3) jointly reasoning over historical knowledge from diverse formats and times to accurately predict future events. Through comprehensive benchmarking, we aim to establish a reliable framework for assessing the capabilities of LLM agents in forecasting international events, thereby contributing to the development of more accurate and trustworthy models for international relations analysis.

cameocode, isocode, relation, (15 more...)

arXiv.org Artificial Intelligence

2407.01231

Country:

Asia > North Korea (0.14)
Oceania > Australia > Australian Indian Ocean Territories > Territory of Cocos (Keeling) Islands (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(234 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Law (1.00)
Government > Foreign Policy (1.00)
Government > Military (0.93)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Digital Divides in Scene Recognition: Uncovering Socioeconomic Biases in Deep Learning Systems

Greene, Michelle R., Josyula, Mariam, Si, Wentao, Hart, Jennifer A.

arXiv.org Artificial Intelligence | Jan-23-2024

Computer-based scene understanding has influenced fields ranging from urban planning to autonomous vehicle performance, yet little is known about how well these technologies work across social differences. We investigate the biases of deep convolutional neural networks (dCNNs) in scene classification, using nearly one million images from global and US sources, including user-submitted home photographs and Airbnb listings. We applied statistical models to quantify the impact of socioeconomic indicators such as family income, Human Development Index (HDI), and demographic factors from public data sources (CIA and US Census) on dCNN performance. Our analyses revealed significant socioeconomic bias, where pretrained dCNNs demonstrated lower classification accuracy, lower classification confidence, and a higher tendency to assign labels that could be offensive when applied to homes (e.g., "ruin", "slum"), especially in images from homes with lower socioeconomic status (SES). This trend is consistent across two datasets of international images and within the diverse economic and racial landscapes of the United States. This research contributes to understanding biases in computer vision, emphasizing the need for more inclusive and representative training datasets. By mitigating the bias in the computer vision pipelines, we can ensure fairer and more equitable outcomes for applied computer vision, including home valuation and smart home security systems. There is urgency in addressing these biases, which can significantly impact critical decisions in urban development and resource allocation. Our findings also motivate the development of AI systems that better understand and serve diverse communities, moving towards technology that equitably benefits all sectors of society.

classification, classification entropy, dataset, (13 more...)

arXiv.org Artificial Intelligence

2401.13097

Country:

North America > United States (0.67)
Oceania > Samoa (0.04)
Oceania > Pitcairn (0.04)
(204 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Smart Houses & Appliances (0.54)
Health & Medicine > Public Health (0.48)
Banking & Finance > Economy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

Pi, Renjie, Han, Tianyang, Xie, Yueqi, Pan, Rui, Lian, Qing, Dong, Hanze, Zhang, Jipeng, Zhang, Tong

arXiv.org Artificial Intelligence | Jan-17-2024

The deployment of multimodal large language models (MLLMs) has brought forth a unique vulnerability: susceptibility to malicious attacks through visual inputs. We delve into the novel challenge of defending MLLMs against such attacks. We discovered that images act as a "foreign language" that is not considered during alignment, which can make MLLMs prone to producing harmful responses. Unfortunately, unlike the discrete tokens considered in text-based LLMs, the continuous nature of image signals presents significant alignment challenges, making it difficult to thoroughly cover the possible scenarios. This vulnerability is exacerbated by the fact that open-source MLLMs are predominantly fine-tuned on limited image-text pairs, which are far fewer than the examples in the extensive text-based pretraining corpus; this makes the MLLMs more prone to catastrophic forgetting of their original abilities during explicit alignment tuning. To tackle these challenges, we introduce MLLM-Protector, a plug-and-play strategy combining a lightweight harm detector and a response detoxifier. The harm detector's role is to identify potentially harmful outputs from the MLLM, while the detoxifier corrects these outputs to ensure the response conforms to the safety standards. This approach effectively mitigates the risks posed by malicious visual inputs without compromising the model's overall performance. Our results demonstrate that MLLM-Protector offers a robust solution to a previously unaddressed aspect of MLLM security.

language model, mllm-protector, preprint arxiv, (13 more...)

arXiv.org Artificial Intelligence

2401.02906

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Illinois (0.04)
North America > Panama (0.04)
(6 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Ask Me Anything: A simple strategy for prompting language models

Arora, Simran, Narayan, Avanika, Chen, Mayee F., Orr, Laurel, Guha, Neel, Bhatia, Kush, Chami, Ines, Sala, Frederic, Ré, Christopher

arXiv.org Artificial Intelligence | Nov-19-2022

Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect prompt" for a task. To mitigate the high degree of effort involved in prompt-design, we instead ask whether producing multiple effective, yet imperfect, prompts and aggregating them can lead to a high quality prompting strategy. Our observations motivate our proposed prompting method, ASK ME ANYTHING (AMA). We first develop an understanding of the effective prompt formats, finding that question-answering (QA) prompts, which encourage open-ended generation ("Who went to the park?") tend to outperform those that restrict the model outputs ("John went to the park. Output True or False."). Our approach recursively uses the LLM itself to transform task inputs to the effective QA format. We apply the collected prompts to obtain several noisy votes for the input's true label. We find that the prompts can have very different accuracies and complex dependencies and thus propose to use weak supervision, a procedure for combining the noisy predictions, to produce the final predictions for the inputs. We evaluate AMA across open-source model families (e.g., EleutherAI, BLOOM, OPT, and T0) and model sizes (125M-175B parameters), demonstrating an average performance lift of 10.2% over the few-shot baseline. This simple strategy enables the open-source GPT-J-6B model to match and exceed the performance of few-shot GPT3-175B on 15 of 20 popular benchmarks. Averaged across these tasks, the GPT-J-6B model outperforms few-shot GPT3-175B. We release our code here: https://github.com/HazyResearch/ama_prompting
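The idea of collecting several noisy votes from imperfect prompts and combining them can be sketched as follows. This sketch deliberately swaps in a plain majority vote for the weak-supervision aggregator the abstract describes, and the `model` callable, the prompt templates, and the function names are all hypothetical stand-ins rather than anything from the released code.

```python
from collections import Counter
from typing import Callable, List


def aggregate_votes(votes: List[str]) -> str:
    """Majority vote over noisy predictions (a simple stand-in for the
    weak-supervision aggregation described in the abstract)."""
    if not votes:
        raise ValueError("need at least one vote")
    # most_common breaks ties by first occurrence in the input order.
    return Counter(votes).most_common(1)[0][0]


def predict_with_prompts(
    task_input: str,
    prompt_templates: List[str],
    model: Callable[[str], str],
) -> str:
    """Apply every prompt template to the input, query the model once per
    prompt, and combine the resulting votes into one prediction."""
    votes = [model(template.format(input=task_input)) for template in prompt_templates]
    return aggregate_votes(votes)


# Hypothetical open-ended QA-style prompts plus one restrictive prompt.
templates = [
    "Context: {input}\nQuestion: Who went to the park?",
    "Context: {input}\nQuestion: Did someone go to the park?",
    "Claim: {input} Output True or False.",
]


def toy_model(prompt: str) -> str:
    """Toy deterministic model used only to make the sketch runnable."""
    return "yes" if "Question:" in prompt else "no"


prediction = predict_with_prompts("John went to the park.", templates, toy_model)
```

A majority vote treats every prompt as equally accurate and independent, which is exactly the assumption the abstract says does not hold, so a real aggregator would weight prompts by estimated accuracy and dependency.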

large language model, machine learning, question generation, (21 more...)

arXiv.org Artificial Intelligence

2210.02441

Country:

North America > United States > New Jersey (0.14)
Africa > Middle East > Libya (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.14)
(83 more...)

Genre:

Research Report (1.00)
Personal (0.92)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground (1.00)
Transportation > Air (1.00)
(20 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

TuSimple Co-Founder Takes Control of Self-Driving Trucking Company

WSJ.com: WSJD - Technology | Nov-17-2022, 02:23:00 GMT

TuSimple Holdings Inc. co-founder Mo Chen has taken control of the self-driving trucking company as federal authorities continue to investigate TuSimple's relationship with Mr. Chen's other startup, a Chinese hydrogen-trucking company. A TuSimple filing with the Securities and Exchange Commission on Wednesday shows that Mr. Chen has 59% of the voting power at the San Diego-based company, giving him control as of Nov. 9, a day before the company announced it had ousted its board of directors. Mr. Chen acquired the stake through stock purchases using his family trust and British Virgin Islands-based entities, according to the securities filing. TuSimple's newly appointed chief executive officer, Cheng Lu, said, "We have a strong sense of urgency to put our company back on track and regain trust from all stakeholders."

chen, tusimple, tusimple co-founder take control, (11 more...)

WSJ.com: WSJD - Technology

Country:

North America > United States > California > San Diego County > San Diego (0.25)
North America > British Virgin Islands (0.25)
Asia > China (0.08)

Genre: Press Release (1.00)

Industry:

Banking & Finance > Trading (1.00)
Transportation > Ground > Road (0.83)
Transportation > Freight & Logistics Services (0.83)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.61)

The AI 'gold rush' in Washington

#artificialintelligence | Jul-3-2022, 07:11:44 GMT

AI's little guys are getting into the Washington influence game. Tech giants and defense contractors have long dominated AI lobbying, seeking both money and favorable rules. And while the largest companies still dominate the debate, pending legislation in Congress aimed at getting ahead of China on innovation, along with proposed bills on data privacy, has caused a spike in lobbying by smaller AI players. A number of companies focused on robotics, drones and self-driving cars are all setting up their own Washington influence machines, positioning them to shape the future of AI policy to their liking. A lot of it is spurred by one major piece of legislation: The Bipartisan Innovation Act, commonly referred to as USICA -- an acronym for its previous title, and its goal to out-innovate China.

ericsson, ether, facial recognition, (15 more...)

#artificialintelligence

Country:

Asia > China (0.46)
North America > United States > Texas > Denton County > Lewisville (0.05)
North America > British Virgin Islands (0.05)
Europe (0.05)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Trading (1.00)
Law > Statutes (0.91)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.50)
